Retrieval Augmented Generation – Part 1

Charles F. Vardeman II

Center for Research Computing, University of Notre Dame

2024-01-19

Hypothesis: Retrieval Augmented Generation Requires Curation

Knowledge Engineering Using Large Language Models

Allen, Bradley P., Lise Stork, and Paul Groth. 2023. “Knowledge Engineering Using Large Language Models.” arXiv, October 1, 2023. https://arxiv.org/abs/2310.00637

Prompt Engineering as Knowledge Engineering

Knowledge Engineering Practice

Trusted AI, LLMs and KE

Retrieval-Augmented Generation for Large Language Models: A Survey

Gao, Yunfan, Yun Xiong, Xinyu Gao, Kangxiang Jia, Jinliu Pan, Yuxi Bi, Yi Dai, et al. 2024. “Retrieval-Augmented Generation for Large Language Models: A Survey.” arXiv. https://doi.org/10.48550/arXiv.2312.10997.

Retrieval Augmented Generation – The Idea

Naive RAG

  • Indexing

    • Data indexing: Cleaning and extracting data from PDF, HTML, Word, Markdown, and image sources

    • Chunking: Dividing text into smaller chunks to fit the LLM’s limited context window

    • Embedding and indexing: Encoding text/images into vectors with an embedding model and storing them in a vector index

  • Retrieval: Given a user query, retrieve the most relevant chunks from the index

  • Generation: The user query and the retrieved documents are combined into a new prompt; the LLM generates a response grounded in this augmented context window.
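The three Naive RAG stages above can be sketched end-to-end in plain Python. The bag-of-words “embedding” and the tiny corpus are illustrative stand-ins; a real pipeline would call an embedding model for the vectors and an LLM with the final prompt:

```python
import math
from collections import Counter

def chunk(text, size=40, overlap=10):
    """Naive chunking: fixed-size character windows with overlap."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text):
    """Toy embedding: a bag-of-words term-frequency vector.
    A real pipeline would call an embedding model here."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, index, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return [c for c, _ in sorted(((c, cosine(q, v)) for c, v in index),
                                 key=lambda p: -p[1])[:k]]

# Indexing: chunk the corpus and embed each chunk.
corpus = "RAG retrieves relevant chunks. The LLM generates an answer from them."
index = [(c, embed(c)) for c in chunk(corpus)]

# Retrieval + generation: retrieved chunks are stuffed into a new prompt,
# which would then be sent to the LLM.
context = "\n".join(retrieve("what does the LLM generate?", index))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: what does the LLM generate?"
```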

Naive RAG Architecture

Langchain Q&A with RAG

“Chunking”

“Chunkviz”

“Chunking” with Overlap

“Chunkviz”

Smarter “Chunking”

LangChain - Recursively Split by Character
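LangChain’s recursive character splitter tries coarse separators first (paragraph breaks, then line breaks, then spaces) and only falls back to a hard character cut, greedily re-merging small pieces up to the chunk size. A pure-Python sketch of that idea (not LangChain’s actual implementation):

```python
def _merge(pieces, sep, chunk_size):
    """Greedily re-join small pieces so chunks approach chunk_size."""
    merged, cur = [], ""
    for p in pieces:
        cand = cur + sep + p if cur else p
        if len(cand) <= chunk_size:
            cur = cand
        else:
            if cur:
                merged.append(cur)
            cur = p
    if cur:
        merged.append(cur)
    return merged

def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split on the coarsest separator that works, recursing to finer
    separators only for pieces that are still too large."""
    if len(text) <= chunk_size:
        return [text] if text else []
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # Last resort: hard character cut.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    pieces = []
    for part in text.split(sep):
        if len(part) <= chunk_size:
            pieces.append(part)
        else:
            pieces.extend(recursive_split(part, chunk_size, rest))
    return _merge([p for p in pieces if p], sep, chunk_size)
```

Because whole words and paragraphs are preferred as split points, chunks tend to end on semantic boundaries rather than mid-word.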

“Chunking” recursive character splitter

“Chunkviz”

“Chunking” with larger segment size

“Chunkviz”

Vector Indexing of the “Chunks”

Problem: A globally fixed chunk size ignores the semantic structure of a document.
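The problem is easy to demonstrate: a global chunk-size constant cuts across sentence boundaries, while even simple sentence-aware splitting preserves semantic units. The text and sizes below are illustrative:

```python
text = "Sensors stream data. Models consume it. Dashboards display results."

# A global chunk size ignores sentence boundaries entirely:
# the first chunk ends mid-word ("...data. Mode").
fixed = [text[i:i + 25] for i in range(0, len(text), 25)]

# Sentence-aware chunking keeps each semantic unit intact.
semantic = [s.strip() + "." for s in text.split(".") if s.strip()]
```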

“Agentic” Chunking

LangChain on X: Proposition-Based Retrieval

Agentic Example: Proposition Based Dense Retrieval

Chen, Tong, Hongwei Wang, Sihao Chen, Wenhao Yu, Kaixin Ma, Xinran Zhao, Hongming Zhang, and Dong Yu. 2023. “Dense X Retrieval: What Retrieval Granularity Should We Use?” arXiv, December 11, 2023. https://arxiv.org/abs/2312.06648v2
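Chen et al. index atomic, self-contained propositions rather than fixed-size chunks. In the paper an LLM (“Propositionizer”) performs the decomposition; the naive sentence splitter below is a hypothetical stand-in just to show where propositions slot into the pipeline:

```python
def decompose(passage):
    """Stand-in propositionizer: naive sentence split.
    A real system prompts an LLM to emit atomic, decontextualized
    propositions (e.g., resolving pronouns to their referents)."""
    return [s.strip() + "." for s in passage.split(".") if s.strip()]

passage = ("The Leaning Tower of Pisa is about 56 metres tall. "
           "Its construction began in 1173.")

# Each proposition becomes its own retrieval unit in the vector index,
# a finer granularity than passage- or sentence-level chunks.
propositions = decompose(passage)
```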

RAG Complexity Overview

Comparison with other optimization methods